SECTION 1

INTRODUCTION



Since the inception of the Achievement Testing Program in 1982, students in
French Immersion programs have had the option of being exempted from the
tests. However, more and more superintendents want full participation for
their French Immersion students. As a result, the number who take
achievement tests has risen dramatically, and now over 80 per cent of French
Immersion students in grades 6 and 9 write the tests.

Many critical issues and methodological problems surround the assessment of
French Immersion students, however. Because of the importance of this matter
to students, parents, educators, and the public at large, we believe that these
issues and problems should be discussed by the broadest possible audience
before changes are made to the existing testing program.

The purpose of this paper is to help focus the discussion, using achievement
data collected recently. We believe that the new information in this report and
the discussion that arises from it will help us find the best course of action.

Section 2 of this paper provides an overview of these issues and methodological 
problems. These must be resolved to ensure that French Immersion students
can be assessed, and the results reported and interpreted, validly and reliably.
Section 3 describes the results of a special study that was done by the Student 
Evaluation Branch to find out if language of testing is a variable that affects 
the way French Immersion students respond to test questions. Section 4 
presents other related information that has been collected through the 
Achievement Testing Program already. The final section summarizes the 
information that is presented and identifies possible future directions. 



SECTION 2

OVERVIEW OF ISSUES RELATED TO ASSESSING
STUDENTS INSTRUCTED IN FRENCH IMMERSION PROGRAMS



BASIC ISSUE 

The main issue that needs to be resolved when assessing students in French 
Immersion programs is: 

Should students instructed in French Immersion programs be assessed and
the results reported on a provincial basis?

Many pedagogical, technical, and political factors underlie this issue. These 
often conflicting or competing factors need to be understood and addressed for 
this issue to be resolved successfully. For example, while we want meaningful 
and useful information about student achievement, we want to ensure that
French Immersion programs are not undermined by any psychometric practice
or process. Yet we know that meaningful and useful achievement information
can only be provided if assessment practices are valid and reliable.
Unfortunately, what is most valid and reliable statistically may not be seen to 
be appropriate politically or pedagogically. 

As with any issue as complex as this one, a number of related sub-issues must 
be addressed before we can resolve the main issue. 

1. Should participation in the Achievement Testing Program be
mandatory for students in French Immersion programs? 

2. In what language should French Immersion students be tested? 

3. Should all French Immersion students be tested in the same 
language? 

4. Should the achievement of French Immersion students be assessed 
using the same tests that are used in the regular Achievement
Testing Program? 

5. Against what standards should the achievement of French 
Immersion students be compared? 

6. Against what reference group should the achievement of French 
Immersion students be compared? 



RELATED ISSUES 
Each of these issues is discussed below. 

1. Should participation in the Achievement Testing Program be mandatory
for students in French Immersion programs?

Achievement testing for French Immersion students is currently optional,
yet most school administrators choose to have their students participate.
This participation suggests that these educators want to know
about the levels of achievement in their French Immersion programs.
Presumably, this information is helpful in making decisions about the
program. But to make such educational decisions, one needs to know not
only the levels of achievement but also if those levels are "good enough".
This type of judgment cannot be made without comparing test scores to
some point of reference. In the case of French Immersion, that point of
reference could be provincial standards and/or average scores for French
Immersion students.

To be meaningful and useful as a point of reference, provincial test scores
must be representative of the population under investigation. The results
are representative only if everyone in the population or a representative
sample of that population is tested. These conditions cannot be met as
long as participation in the Achievement Testing Program remains
optional for French Immersion students.

If participation in the Achievement Testing Program were mandatory for
all students in French Immersion programs, then the only students who
could be excused from participating would be those for whom a particular
test was inappropriate. This would include students who were enrolled in
a special needs program or those whose instruction in the course being
tested was in another semester or year.

Educators opposed to mandatory participation argue that because of the
nature of French Immersion programs (their recency and variability in the
amount of daily instruction in French, for example), there is a high
potential for misinterpretation of achievement test results. They feel that
misinterpretation, if negative, could jeopardize support for these
programs. Thus, they believe they should have the opportunity to opt out
of a testing program.

The counter-argument is that it is precisely because of the recency and the
variability of French Immersion programs that achievement in these
programs should be monitored closely. Without such information, it is
difficult to know the relative strengths and weaknesses of the program. In
[illegible] where desirable, is limited. In short, rather than fearing negative
[illegible] if unexpected or unfavorable, supporters [illegible] argue that one
should use that information to make adjustments or improvements to the
program.

2. In what language should French Immersion students be tested?

Participation in the Achievement Testing Program is optional for those
students whose language of instruction is other than English and/or those
students who are enrolled in an English as a Second Language program.
Implicit in these categories of exemption is the notion that in order for
participation to be appropriate for any given student, his or her language
of instruction and language of fluency must match the language of testing.
A match between the language of instruction and language of testing is
important because conclusions about student achievement may only be
valid if the test questions assess what students have learned in that
language. Testing students in their language of fluency is also important
to validity because what students know and can do can only be fairly
assessed if examinees can easily read and understand the test questions.

In French Immersion classes, the language of instruction and the language
in which most students are fluent are not necessarily the same. This poses
a dilemma in choosing a language of testing because, regardless of which
language is chosen, there is a potential risk. Choosing to test in the
language of instruction when it is not the language of fluency could cause
scores to be artificially low. These artificially low scores could cause
negative political and pedagogical implications if they are interpreted to
mean that lower achievement is both unexpected and unacceptable. On
the other hand, a negative message could be seen to be sent to educators
and the public about our confidence in our students' capabilities if French
Immersion students were to be assessed in English (i.e. the language of
fluency) rather than French, the language of instruction.



3. Should all French Immersion students be tested in the same language? 

An issue that is related to the question "In what language should French
Immersion students be tested?" is whether or not all French Immersion
students should be tested in the same language. If the answer is that the
language of testing should be variable, then a secondary issue concerns the
level (individual, class, school, jurisdiction) at which the language of
testing option should be allowed to occur.

The primary argument for mandating a single language of testing in
French Immersion programs is that this decision would hold constant a
variable that, it has been argued, affects how students respond to test
questions. English tests and their French translation may not be equally
difficult, because the translated questions might be easier or more difficult
to answer. This means that students could have more or less trouble
selecting the correct answer depending on which form of the test they
write, the English original or its French translation. In other words, their
scores could vary depending on the language of testing. Similarly,
students may have more or less trouble reading and understanding the
questions when they are presented in one language rather than the other
because of unequal levels of first and second language reading ability.
Since the selection of a correct answer to a test question depends in part on
the ability to understand the question, then once again, student scores
could vary depending on the language of testing.

Mandating one language of testing province-wide will not eliminate
whatever effect choosing that particular language has on the students'
ability to respond, but what it will do is hold that effect constant. For
example, if scores are lower than expected because the form of the test
chosen is more difficult, at least all groups of French Immersion students
who are tested will suffer the same score depression, making it easier to
interpret test scores. Essentially, the argument against mandating one
language of testing is that such a policy may be seen as insensitive to local
needs and conditions.



4. Should the achievement of French Immersion students be assessed using
the same tests that are used in the regular Achievement Testing Program?

A strength of the Achievement Testing Program is that the tests are based
on Alberta programs of study. Because of this match between what is
expected and what is tested, the results of testing can be used in
conjunction with other sources of information to make meaningful and
useful decisions about curriculum, resources, instruction, and so on.
Obviously, the goal of French achievement testing is to have results that
are equally meaningful and useful, and therefore tests must be
curricularly valid. In general, the learning objectives in the French
Immersion programs are the same as those in the regular (English)
program. This implies that, at least in principle, the English program
achievement tests are also a fair representation of French Immersion
program curricula. What may differ are the expected standards of
achievement within a French Immersion program or subtleties in the way
a curriculum is interpreted, for example. The question, then, is whether or
not these differences are great enough to warrant producing separate
achievement tests for the French Immersion programs in science,
mathematics and social studies; language arts tests are currently produced
separately for the English and French programs.

Assuming that the curricula match across programs but that the standards
of performance vary, it is possible to address this difference without having
to use separately developed tests. In the case of multiple-choice tests, all
that would be required is to select different scores as the cut-off points
representing the expected standards of achievement in each program. For
written-response items, the scoring standard could be adjusted without
having to alter the nature of the assessment tasks.
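
As a rough illustration of the cut-off idea described above, the following Python sketch applies a different cut-off score to one common multiple-choice test for each program. The scores, the cut-off values, and the names `CUT_OFFS` and `percent_meeting_standard` are all hypothetical; none of them come from the Achievement Testing Program.

```python
# Hypothetical cut-off scores on one common test (illustrative values only).
CUT_OFFS = {"english": 25, "french_immersion": 22}


def percent_meeting_standard(scores, cut_off):
    """Return the percentage of scores at or above the cut-off."""
    if not scores:
        raise ValueError("scores must not be empty")
    meeting = sum(1 for s in scores if s >= cut_off)
    return 100.0 * meeting / len(scores)


if __name__ == "__main__":
    # Invented raw scores, used only to exercise the function.
    sample = {
        "english": [31, 24, 28, 19, 35, 26],
        "french_immersion": [23, 21, 30, 18, 27, 22],
    }
    for program, scores in sample.items():
        pct = percent_meeting_standard(scores, CUT_OFFS[program])
        print(f"{program}: {pct:.1f}% at or above cut-off {CUT_OFFS[program]}")
```

The test itself is unchanged between programs in this sketch; only the score that counts as meeting the standard differs.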

Three factors support the use of English program tests with French
Immersion students. First, by using the same tests, it is possible to
compare achievement between the two programs. It is precisely this point
that causes some to argue for separate tests, since they see this form of
comparison as undesirable. Second, it is more cost efficient to translate
tests than it is to produce separate French and English versions. Thus, if
the content of English program tests is representative of French
Immersion curricula, then it is more desirable to use resources for other
initiatives rather than to produce essentially parallel tests in two
languages. Finally, since the technical merit of a test is determined in part
by the size of the population in which it can be field tested, then English
program tests are likely to be more valid and reliable indicators of
achievement than test forms developed separately for French Immersion
students would be. Therefore, whatever validity that might be lost by
using translations of English program tests would be gained by using
instruments that are technically more sound.

The essential argument for not using English program tests in their
translated form is that the level of language used on those translations is
too difficult for many French Immersion students. The level of language
used on the English forms, and therefore the French translations, is that
which is readable to a native language speaker of the age and grade level
being tested. It follows that because French is a second and not a native
language, French Immersion students may be unable to fully comprehend
the test material.

The argument follows that if separate tests were developed for French
Immersion students, then levels of language appropriate for second
language speakers could be used. The limitation of this argument is that if
the text were simplified to the extent that second language readers could
comprehend it, it may also be simplified to the point where the difficulty of
the material being tested on English and French tests is no longer
parallel. This parallelism could be lost because the abstractness of the
concepts being measured on a test is closely related to the language used to
express those concepts. In other words, the level of subject-specific content
being tested could be reduced by simplifying the language used on the test.

5. Against what standards should the achievement of French Immersion
students be compared?

By themselves, achievement test scores have no meaning. To have
meaning, they must be given a context. One way of providing a context is
to compare those scores with expected levels of performance. This process
requires two distinct judgments. The first is to determine what percentage
of students tested can be expected to achieve at least an acceptable level of
skill and knowledge. The other is to establish the test score that
represents that level.

For French Immersion scores to be reported meaningfully, this comparison
to standards seems essential. Thus, the two judgments referred to above
must be made. An issue that arises out of this concern is whether or not it
is reasonable to expect French Immersion and regular (English language)
program students to achieve the same standard of performance on learning
objectives that are common. One line of reasoning is that French
Immersion students divide their attention among five academic subjects
(English language arts, French language arts, social studies, mathematics,
and science) rather than four, and thus expectations should be lower for
these students. Another line of reasoning is that since participation in
French Immersion programs is decided by parents, these students may, in
general, come from more supportive home environments than students in
the regular program, and thus expectations for this group should be
higher. Whatever decision about standards is made, a technical question
that arises is how best to set standards so that appropriate expectations
exist on French and English forms of a test.

6. Against what reference group should the achievement of French Immersion
students be compared?

Another way to provide a context for test scores is to compare the
performance of the group in question against that of a reference group. In
the case of the current Achievement Testing Program, there is no
appropriate reference group at the provincial level, since provincial
averages are based on the achievement of the entire population of students
who were tested. However, provincial averages can be used as the
reference group scores against which individual school and jurisdiction
levels of performance can be compared.

In the case of French Immersion test scores, the central issue concerns
[illegible] appropriate reference group. From the point of view of
individual schools and jurisdictions, an appropriate reference group would
be all French Immersion students who wrote the test. This assumes,
however, that participation in the testing program is mandatory, since
reference group norms must be based on the scores obtained by all
members of the population or a representative sample of that population.
Other possibilities exist, however. If the learning objectives and the
standards of performance for French Immersion and English language
programs are essentially the same, then it may be appropriate,
statistically, to compare French Immersion group averages with English
averages. One of the variables that may need to be considered in
[illegible] is language of testing and the
comparability of forms if separate tests are used for French Immersion and
English program achievement testing, since the difficulty of the test affects
examinee performance and, therefore, the validity of comparisons.

SUMMARY COMMENTS 

[Illegible] of the issues described in this section is essential to ensure that
the best possible assessment and reporting plan for students in French
Immersion programs is implemented. [Illegible] of educators delivering
[illegible] ensuring that the methods used to collect data and
report on student achievement reflect the needs of students in these programs.

In Section 3, the Language of Testing Study conducted in 1989 is
presented. This study addresses, in part, the effect that language of testing has
on student achievement for students instructed in French Immersion programs.

Finally, the process of developing reliable and valid achievement tests is
carefully outlined in a brochure, Developing Achievement Tests, Grades 3,
[remainder of title illegible]. [Illegible] please call the Assistant Director
[illegible].

SECTION 3

LANGUAGE OF TESTING STUDY

INTRODUCTION



In a report prepared for Alberta Education, Carey (1980) noted that when
French Immersion students respond to test questions their responses are
shaped not only by their levels of knowledge and skill proficiency (the
attributes under study) but also by the nature of the test and by their ability to
read that test. He argued that because their responses were shaped by these
factors and these factors could vary depending on the language of testing,
French Immersion students could achieve different scores depending on
whether they wrote a test in French or in English.

Because of the importance of the issue "In what language should French
Immersion students be tested?", Student Evaluation staff undertook a study to
determine what effect, if any, language of testing has on French Immersion
students' responses to achievement test questions. This language of testing
study examined how French Immersion students in grades 3 and 6 responded
to social studies test questions presented in French and in English.



DESIGN OF THE STUDY 

The language of testing study had two parts. The design of each part is 
presented separately below. 

Grade 3 Social Studies

In June 1989, 664 grade 3 French Immersion students from three urban
jurisdictions wrote either the original English version or a French translation of the
1988 Grade 3 Social Studies Achievement Test. All students taking part in the
study had received their grade 3 social studies instruction in French. On
average, they had received approximately 75 per cent of their daily instruction
in French in grade 3.

Staff from the Student Evaluation Branch used class lists from the
participating schools to assign randomly French or English forms of the test to
equal numbers of students in each classroom.
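
The within-classroom assignment described above could be sketched as follows. This is only an illustration of one way to split a class list evenly at random; the function name `assign_forms`, the seed, and the student labels are hypothetical, and the report does not say how classes with an odd number of students were handled (here the extra form is chosen at random).

```python
import random


def assign_forms(class_list, rng):
    """Randomly split one classroom's students between the two test forms.

    Returns a dict mapping each student to "E" (English) or "F" (French).
    With an odd class size the extra form is chosen at random (an assumption).
    """
    students = list(class_list)
    rng.shuffle(students)  # random order, so form assignment is random
    half = len(students) // 2
    forms = ["E"] * half + ["F"] * half
    if len(students) % 2:
        forms.append(rng.choice(["E", "F"]))
    return dict(zip(students, forms))


if __name__ == "__main__":
    rng = random.Random(1989)  # fixed seed so the example is repeatable
    classroom = ["student_%02d" % i for i in range(1, 9)]  # invented labels
    for student, form in sorted(assign_forms(classroom, rng).items()):
        print(student, form)
```

Randomizing separately within each classroom keeps the two groups balanced on teacher and school, which is presumably why the class lists were used.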

The returned test booklets were scored under the direction of Student
Evaluation Branch staff. The results were then analyzed to determine if there
were significant differences in the scores achieved by the two groups of
students.

Grade 6 Social Studies

In June 1989, 416 grade 6 French Immersion students from two urban
jurisdictions took part in the language of testing study. These students, who
had all received their social studies instruction in French, wrote French or
English forms of the regularly scheduled 1989 Grade 6 Social Studies
Achievement Test - Part A: Multiple Choice and Part B: Written Response. On
average, they had received 65 per cent of their daily instruction in French in
grade 6.

Staff from Student Evaluation used class lists from the participating schools to
assign randomly French or English forms of the test to equal numbers of
students in each classroom.

The returned answer sheets and written-response test booklets were scored
under the direction of Student Evaluation staff. Special steps were taken to
ensure that the same standards were applied to the marking of the French and
English written-response questions. The results from the multiple-choice
and written-response portions of the test were then analyzed to determine if
there were significant differences in the scores achieved by the two groups of
students.



RESULTS OF THE STUDY 

Grade 3 Social Studies

Table 3-1 presents a comparison, by reporting category, of the scores obtained
by the students who wrote the English and French forms of the 1988 Grade 3
Social Studies Achievement Test. The results indicate that the group of
students who wrote the test in French (F group) achieved significantly lower
scores than did those who wrote the test in English (E group) on all reporting
categories except one. These results support the hypothesis that, in grade 3,
the responses of French Immersion students are sensitive to the language of
testing.
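
To show how a variance ratio (F) for two groups can be computed from summary statistics alone, here is a minimal Python sketch. It uses the total test averages and standard deviations reported for the E and F groups in Table 3-1 and assumes two equal groups of 332 students (half of 664). The exact group sizes are not reported, so that figure, like the function name `f_ratio_two_groups`, is an assumption; the report does not describe the computation that was actually used.

```python
def f_ratio_two_groups(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """One-way ANOVA F ratio for two groups, from summary statistics."""
    grand_mean = (n_a * mean_a + n_b * mean_b) / (n_a + n_b)
    # Between-groups mean square (1 degree of freedom for two groups).
    ms_between = (n_a * (mean_a - grand_mean) ** 2
                  + n_b * (mean_b - grand_mean) ** 2)
    # Within-groups mean square (pooled variance of the two groups).
    df_within = n_a + n_b - 2
    ms_within = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df_within
    return ms_between / ms_within


if __name__ == "__main__":
    # Total test, grade 3: E group 29.3 (S.D. 8.4), F group 21.8 (S.D. 8.0).
    # Group sizes of 332 each are assumed, not reported.
    f_value = f_ratio_two_groups(29.3, 8.4, 332, 21.8, 8.0, 332)
    print(f"F = {f_value:.1f}")
```

Under these assumptions the result lands near, though not exactly on, the 137.3 reported for the total test; rounding of the published averages and unequal actual group sizes could account for the gap.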

A comparison of E group scores to the 1988 provincial averages indicates that
there are significant differences for all reporting categories. This suggests that
the French Immersion students tested in this study had levels of social studies
achievement that were lower than those achieved by students in regular
English language programs in 1988. These results are surprising for two
reasons. First, since the grade 3 students in this study came from the same or
similar schools and jurisdictions as the students in the grade 6 portion of the
study, and the levels of performance of the grade 6 students were higher than
those achieved provincially, it seems logical that the grade 3 students should
have achieved average or above average levels of performance. Second, these
results do not compare to those from 1988, in which a group of grade 3 French
Immersion students from a jurisdiction similar to those participating in this
study achieved levels of performance that were better than those achieved
provincially. In short, based on other evidence, one would have predicted that
the grade 3 French Immersion students tested in this study would have had
levels of achievement that were at or above the 1988 provincial levels.
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Because these results were unexpected, an eiplaniition of their underlying 
r ^use is trarranted. Two voesibiiitieeo^metomind. ine first ii that the 
. rels of achievement of the atudenta in this study are equal to the 1988 
provincial levels but the E group scores do not accurately reflect this. 
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TabU3-l 

Grade 3 Language of Testing Study 1989 
Results by Reporting Category 
for Grade 3 Social Studies 



BJTect Size 

Reporting Average^ 3tandar<i PflYlHtaffTl Relative to 

Category E Group F Group E Group F Group Prov, S.D.^ 



Total Test 


29.3 
(33.8) 


21.8 


8.4 


8.0 


137.3*** 


.83 


Topic A 


9.2 
(11.7) 


8-3 


3.0 


3.0 


17.2*** 


.27 


Topic B 


11.0 
(11.4) 


7.2 


3.2 


3.6 


206.9*** 


1.18 


Topic C 


9.1 
(10.7) 


6.3 


3.7 


3.0 


111.9*** 


.74 


Knowledge and Comprehension 














All Topics 


12.7 
(14.8) 


9.6 


4.2 


3 9 


96.2*** 


.70 


Topic A 


3.7 
(4.8) 


3.7 


1.7 


1.7 


0 9 


0 


Topic B 


4.8 

(5 3) 


3.3 


1.8 


1.8 


112.4*** 


.83 


Topic C 


4.1 
(4.7) 


2.5 


1.9 


1.6 


131.2*** 


.83 


Value Concepts and Valuing 
Skills (All Topics) 


3.3 
(4.0) 


2.3 


1.4 


1.4 


90.1*** 


.68 


Inouiry Skills I 
(All Topics) 


7.6 
(8.7) 


6.2 


2.3 


2.3 


52.7*** 


.56 


Inquiry Skills II 
(All Topics) 


5.8 
(6.3) 


3.7 


2.2 


2.0 


160.8*** 


1.0 



***p. < .001 



jjhe bracketed figures below the E group averages are the 1988 grade 3 provincial averages. 
|F refers to tfie variance ratio, with tk^t appropriate degrees of freedom. 
This is derived from the ratio: Average sc ore for E Group . averafe score for F Grnnp 

Standard deviation of E Group Average 
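
The effect size ratio given in the footnote to Table 3-1 can be sketched in Python as follows. The function name `effect_size` and the parameter `reference_sd` are hypothetical. The footnote names the E group standard deviation as the denominator, while the column heading reads "Relative to Prov. S.D.", so the denominator is left as a parameter rather than fixed. The example uses the total test row with the E group standard deviation.

```python
def effect_size(mean_e, mean_f, reference_sd):
    """Difference between the E and F group averages in standard deviation units."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (mean_e - mean_f) / reference_sd


if __name__ == "__main__":
    # Total test, grade 3: E group 29.3, F group 21.8, E group S.D. 8.4.
    es = effect_size(29.3, 21.8, 8.4)
    print(f"effect size = {es:.2f}")
```

With the E group standard deviation this works out to roughly .89, somewhat above the .83 printed in the table. That difference would be consistent with a slightly larger provincial standard deviation having been used, but the provincial value is not given in this section.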

This systematic underestimation of grade 3 students' social studies knowledge
and skill could have resulted from their inability to understand completely
what the English versions of the questions were asking. If they could not
understand the questions, they could not demonstrate what they knew and
could do in social studies. This hypothesis is based on the assumption that
because English language arts was not introduced to these students until their
second or third year in school, their reading skills in English may have been
insufficiently developed to permit them to comprehend the test questions.
What limits this hypothesis is that the results of the 1989 Grade 3 English
Language Arts Achievement Test administration do not support the
assumption that the students in this study had limited English language arts
skills. Those results, presented in the next chapter, show that the French
Immersion students who wrote the English Language Arts Achievement Test in
June 1989 achieved scores that were approximately equal to provincial
averages. This suggests that, in general, grade 3 French Immersion students
have English language arts skills that are adequate or better.

An alternative hypothesis to explain the unexpectedly low E group scores is
that because 1989 was not a testing year for grade 3 social studies, the
emphasis given to social studies instruction in particular, in the classrooms
tested in this study, was less than in 1988. As a result, the levels of social
studies achievement in these French Immersion classrooms in 1989 were lower
than the provincial levels of achievement in 1988. This alternative hypothesis,
while disturbing, could be valid since it suggests a pattern that has been
argued to exist in English language programs. Ironically, its truth would be
good news from a measurement point of view because then it could be assumed
that the scores of the E group of students were accurate reflections of their
actual levels of achievement. This in turn would indicate that it is possible to
assess accurately the levels of achievement in grade 3 French Immersion
classrooms using the achievement tests that are designed for regular English
language students.

Grade 6 Social Studies

Table 3-2 presents a comparison of the total test, multiple-choice, and
written-response scores achieved by the two groups of students who wrote the
1989 Grade 6 Social Studies Achievement Test. The results indicate that the
group of students who wrote the French form of the test (F group) achieved
significantly lower scores than did those who wrote the test in English (E
group). This suggests that, when their level of social studies achievement is
measured, the responses of grade 6 French Immersion students are sensitive to
the language of testing.



A comparison of E group scores to provincial averages indicates that the French
Immersion students tested in this study had scores that were significantly
higher than those achieved by students in the regular English language
program. When the F group scores are compared to those achieved by all grade
6 French Immersion students who wrote the provincial achievement test in
French, no significant differences are found. These results suggest that the
level of achievement of the students selected for this study is representative of
French Immersion students generally and that this level is somewhat higher
than that of students in the regular English language program.

Table 3-2

Grade 6 Language of Testing Study 1989
Results by Major Reporting Category
for Grade 6 Social Studies

                                  Average¹            Standard Deviation              Effect Size
Reporting                                                                             Relative to
Category                          E Group  F Group    E Group  F Group    F²          Prov. S.D.³

Total Test                        69.9     53.8       14.8     14.5       126.1***    .99
                                  (62.5)   (55.3)

Part A: Multiple Choice           36.1     27.1       8.2      7.8        134.4***    1.00
                                  (32.2)   (27.8)

Part B: Written Response          19.4     15.8       4.8      5.3        49.3***     .70
                                  (17.5)   (16.4)

***p < .001

¹The bracketed figures below the E group averages are the 1989 grade 6
provincial averages. The average scores of the French Immersion students who
were not part of the language of testing study but who wrote the French
translation of the 1989 Social Studies Achievement Test are presented
in brackets below the F group scores.
²F refers to the variance ratio, with the appropriate degrees of freedom.
³This is derived from the ratio:

    (Average score for E Group - average score for F Group)
    / Standard deviation of E Group Average



Table 3-3 presents a comparison of the scores achieved by the E and F groups for
each of the reporting categories specified by the blueprint for Part A: Multiple
Choice. The results indicate that F group students achieved significantly lower
scores on all reporting categories than did the E group students, although the
size of that difference was variable. This suggests that while the effect on scores
of varying the language of testing is systematic across all reporting categories,
the size of that effect is not constant.

A comparison of E group and provincial scores suggests that the French
Immersion students tested in this study achieved levels of performance in social
studies that were as good as or better than those achieved by most students in
regular English language programs. At the same time, the F group's levels of
performance on items related to Topic C were significantly lower than those of
students in French Immersion programs generally. This suggests that when it
comes to their level of achievement in this area, the students in this study are
somewhat unrepresentative of all French Immersion students in grade 6 social
studies.

Table 3-3

Grade 6 Language of Testing Study 1989
Results for Part A: Multiple Choice
for Grade 6 Social Studies

Reporting                      Average¹       Standard Deviation                 Effect Size
Category                    E Group  F Group   E Group  F Group        F²        Relative to
                                                                                 Prov. S.D.³

Topic A                      13.2     10.8       2.7      2.8     [illegible]    [illegible]
                            (12.3)   (10.9)
Topic B                      11.4      8.4       3.1      2.8     [illegible]    [illegible]
                             (9.7)    (8.3)
Topic C                      11.6      7.8       3.5      3.8     [illegible]       1.01
                            (10.1)    (8.6)

Knowledge and Comprehension
  All Topics                 15.2     12.0       4.0      3.6       72.6***          .76
                            (14.0)   (12.2)
  Topic A                     5.7      5.0       1.3      1.2       33.8***          .51
                             (5.5)    (4.8)
  Topic B                     4.6      3.6       1.6      1.5       46.5***          .59
                             (4.0)    (3.6)
  Topic C                     4.8      3.4       2.1      2.2       44.0***          .65
                             (4.0)    (3.8)

Value Concepts and Valuing
Skills (All Topics)           4.8      3.8       1.2      1.6       47.0***          .68
                             (4.1)    (4.0)

Inquiry Skills I
(All Topics)                  6.0      5.3       1.6      1.8       17.6***          .40
                             (5.2)    (5.3)

Inquiry Skills II
(All Topics)                  5.8      3.0       1.8      1.5      302.1***         1.45
                             (5.3)    (3.1)

Inquiry Skills III
(All Topics)                  4.4      3.0       1.3      1.5      107.3***          .92
                             (3.9)    (3.2)

***p < .001

¹ The bracketed figures below the E group averages are the 1989 grade 6
provincial averages. The average scores of the French Immersion students who
were not part of the language of testing study but who wrote the French
translation of the 1989 Social Studies Achievement Test are presented
in brackets below the F group scores.

² F refers to the variance ratio, with the appropriate degrees of freedom.

³ This is derived from the ratio:
(average score for E group - average score for F group) /
(standard deviation of E group average)

Table 3-4 presents a comparison of the averages achieved by the E
and F groups for the two reporting categories specified by the
blueprint for Part B: Written Response. The results are parallel to
those for the multiple-choice portion of the test: while the language
of testing effect favored those who wrote the test in English, the size
of that effect varied across the two reporting categories.

Table 3-4

Grade 6 Language of Testing Study 1989
Results for Part B: Written Response
for Grade 6 Social Studies

Reporting          Average¹        Standard Deviation                Effect Size
Category        E Group  F Group    E Group  F Group       F²        Relative to
                                                                     Prov. S.D.³

Short Answer     10.2      7.7        3.2      3.6       57.0***        .76
                 (9.3)    (8.1)
Composition       9.1      8.2        2.5      2.8       14.2***        .32
                 (8.2)    (8.2)

***p < .001

¹ The bracketed figures below the E group averages are the 1989 grade 6
provincial averages. The average scores of the French Immersion students who
were not part of the language of testing study but who wrote the French
translation of the 1989 Social Studies Achievement Test are presented
in brackets below the F group scores.

² F refers to the variance ratio, with the appropriate degrees of freedom.

³ This is derived from the ratio:
(average score for E group - average score for F group) /
(standard deviation of E group average)
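As a rough illustration of the effect-size footnote, the ratio can be
recomputed from the averages and standard deviations printed in Tables 3-3 and
3-4. The sketch below is in Python; the function name `effect_size` and the
layout of `rows` are assumptions made for illustration and are not part of the
study. It uses the E group standard deviation printed in the tables as the
denominator, so the results fall close to, but not exactly on, the tabulated
values, which the column heading describes as relative to the provincial
standard deviation.

```python
# Illustrative sketch only: recompute the effect-size ratio from table values.
# Assumption: the denominator is the E group standard deviation from the tables.

def effect_size(e_mean: float, f_mean: float, e_sd: float) -> float:
    """Difference between E and F group averages, in E group S.D. units."""
    if e_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (e_mean - f_mean) / e_sd

# (E average, F average, E standard deviation, tabulated effect size)
rows = {
    "Knowledge and Comprehension, All Topics": (15.2, 12.0, 4.0, 0.76),
    "Inquiry Skills I": (6.0, 5.3, 1.6, 0.40),
    "Inquiry Skills II": (5.8, 3.0, 1.8, 1.45),
    "Short Answer": (10.2, 7.7, 3.2, 0.76),
    "Composition": (9.1, 8.2, 2.5, 0.32),
}

for name, (e_mean, f_mean, e_sd, tabulated) in rows.items():
    approx = effect_size(e_mean, f_mean, e_sd)
    print(f"{name}: recomputed {approx:.2f}, tabulated {tabulated:.2f}")
```

Read this way, the categories differ widely: the recomputed gap for Inquiry
Skills II is several times that for Inquiry Skills I or Composition, which is
the variability across reporting categories that the text describes.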



DISCUSSION 

The purpose of the language of testing study was to determine if grades 3 and 6
French Immersion students achieved different scores depending on whether they
wrote the social studies achievement tests in French or in English. The results
indicate that the scores for students who wrote the French forms of the tests
are consistently and significantly lower than are the scores achieved by
students who wrote the tests in English.

The E and F groups in both the grade 3 and the grade 6 portions of the study
were composed of students randomly selected from the same French Immersion
classrooms. Because of this random assignment of students to groups, it can be
assumed that all conditions (e.g., levels of ability; quality of instruction)
except the language of testing were the same across the two groups within each
grade. It is therefore possible to attribute the significant differences in
scores across groups to the only variable that changed: the language in which
the tests were presented and written.
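The logic of within-classroom random assignment can be sketched in a few lines
of Python. This is a hypothetical illustration, not the procedure the Student
Evaluation Branch actually followed: the function name `assign_groups`, the
roster format, the student labels, and the even split are all assumptions.

```python
import random

def assign_groups(classrooms: dict[str, list[str]], seed: int = 0) -> dict[str, dict[str, list[str]]]:
    """Randomly split each classroom's students into an E and an F group."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    assignment = {}
    for room, students in classrooms.items():
        shuffled = students[:]  # copy so the original roster is not modified
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        # Both groups are drawn from the same classroom, so teacher and
        # instruction are shared; only the test language differs.
        assignment[room] = {"E": shuffled[:half], "F": shuffled[half:]}
    return assignment

# Hypothetical rosters, for illustration only
rosters = {
    "Class 1": ["s1", "s2", "s3", "s4"],
    "Class 2": ["s5", "s6", "s7", "s8", "s9"],
}
groups = assign_groups(rosters)
for room, split in groups.items():
    print(room, split)
```

Splitting within each classroom, rather than assigning whole classrooms to one
language, is what allows classroom-level conditions to be treated as equal
across the two groups.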

What makes the language of testing effect statistically as well as educationally
important is its magnitude. A variation in scores as large as that which
occurred in this study has implications for data interpretation. This is most
clearly seen when the 1989 grade 6 provincial averages are compared with the
E and F group data. Because the E group scores are significantly higher than
the provincial averages, a reasonable interpretation of the data is that the
students in this study had levels of achievement that were higher than
provincial levels.

From this it is possible to conclude that through French Immersion instruction,
these students were able to acquire some proficiency in French while at the
same time achieving levels of academic performance that were equal to or
better than those in regular English language programs. This same
interpretation and conclusion would not seem reasonable were it based on a
comparison of the F group and provincial averages, because the F group scores
are significantly lower. In short, because of the magnitude of the language of
testing effect, two very different conclusions could be drawn about the levels
of achievement of the same group of students.

Aside from the systematic depression of F group scores relative to E group
scores, two other trends are notable in the data. First, the size of the
differences in E and F group scores varies across reporting categories within
each grade. Since items are grouped into reporting categories according to
their content and cognitive levels, this trend suggests that the size of the
language of testing effect varies depending on the nature of the questions being
asked (i.e., on their topic and type). The second notable pattern is that the
size of the differences in scores varies between similar reporting categories
when comparing grade 3 results to grade 6 results. This pattern suggests that
grade level may be a factor that contributes to how students respond to French
and English forms of tests.

These results suggest that the findings from this study may not be generalized
to other grade levels and subject areas except in general terms. In other words,
while it can be assumed that the scores of French Immersion students may also
be sensitive to the language of testing when their levels of achievement in
grades 3, 6, and 9 mathematics and science are assessed, the magnitude or
pattern of those differences cannot reasonably be predicted on the basis of these
results. Further study therefore is needed before decisions are made about the
language in which French Immersion students should write achievement tests.

SECTION 4

APPROPRIATE STANDARDS AND REFERENCE GROUPS FOR STUDENTS
INSTRUCTED IN FRENCH IMMERSION PROGRAMS



As discussed earlier in this report, test results must be compared with 
expectations before they can be interpreted meaningfully. This section will 
present some information that relates to identifying appropriate expectations 
for students instructed in French Immersion programs. 

An important question is whether students in French Immersion programs
should be expected to achieve at the same level as students in the regular
program. One line of reasoning might be that these students are dividing their
academic attentions among five subjects (English language arts, French
language arts, social studies, mathematics, and science) rather than four, and
thus expectations should be lower. Another line of reasoning might be that,
since student participation in the program is at the parents' option, French
Immersion students may come from a more supportive home environment on
average, and thus expectations for the group should be higher.

The actual performance of French Immersion students on achievement tests
may shed some light on which expectations are reasonable. Unfortunately,
before the 1989 achievement tests, students were not required to identify
themselves by program, so analyses of achievement test data for earlier years
are generally unavailable.

Some information is accessible. Related data for grade 3 social studies (1988)
and for grade 6 (1988) and grade 9 (1986) language arts have been analyzed.



Grade 3 Social Studies 1988

In 1988, many grade 3 students in the French Immersion program in one large
urban jurisdiction wrote the Grade 3 Social Studies Achievement Test, which
was available only in English. In cooperation with that jurisdiction, the
students were identified and their results were compared to those of other
students in the jurisdiction. Total scores for students receiving instruction in
French in French Immersion programs were significantly higher than for students
in other programs in the jurisdiction, and also higher than for students in the
English language program in the same schools. The results are shown in Table
4-1. Interpreting these results must take into consideration that the students
were instructed in French and tested in English. For this case, at least, there
seems to be no reason to have lower expectations for the French Immersion
program students.

Table 4-1
Grade 3 Social Studies 1988
Results for Students Receiving
Instruction in Either French or English¹

                                Test Scores
                                Average          Standard      Number of
Group                           (out of 60)      Deviation     Students

Students Instructed in
French in French
Immersion Programs              34.8             7.7            264

Students Instructed in
English in Schools
offering French Immersion       33.1             8.6            332

Students Instructed in
English in Schools NOT
offering French Immersion       31.1***          9.0           3996

***Differences between the last group's average score (31.1) and the first
two groups' average scores (33.1, 34.8) are statistically significant
(p < .001).
The difference between the first two groups is not statistically
significant.

¹ All schools were in the same jurisdiction.



Grade 6 French and English Language Arts 1988

Tests designed to measure achievement in Grade 6 English Language Arts
and Grade 6 French Language Arts were administered in June 1988.
Student participation in the French language arts test was at the option of
the superintendent. A total of 1550 students completed both tests.

Because writing the French language arts test was optional, and because
the nature of the language education and background of students can vary
greatly, caution should be exercised in interpreting the results. It is not
known which students were Francophone and which were French
Immersion, but the majority of the students were in French Immersion
programs.

Each test consisted of two parts: a 50-question multiple-choice reading section
and a written-response section scored on five categorical scales. Both sections
were scored under the supervision of the Student Evaluation Branch. For the
purpose of calculating a total score for the test, each section was weighted 50
per cent, giving total scores out of 100.

It is of some interest to compare the results of these two tests and to compare
the English achievement scores of students who took the French test with the
average for all those who took the English test.

The correlation of total scores for the 1550 students who completed both tests
was 0.623. This correlation indicates a fairly high degree of relationship
between the skills and knowledge measured by the French test and those
measured by the English test. The average score for students who wrote both
tests was 68.9 per cent on the English test, compared to a provincial average of
62.5 per cent. These results show that students who took language arts in both
languages achieved higher scores in English language arts than students who
received instruction in English language arts only. It is not known, of course,
whether the French Immersion students involved would have achieved higher
or lower English language arts test scores if they had been in a regular
program. Students whose parents chose to place them in French Immersion
programs may on average have higher levels of ability than students in the
regular program. There is, at any rate, no evidence that the English language
arts skills of students taking both courses are on average lower than those of
students taking only one language arts course.

The higher scores on the English language arts achievement test for students
who wrote both tests were consistent in both the reading (multiple-choice) and
writing sections of the test. Students who wrote both averaged 35.9 out of 50
on the reading section and 16.5 out of 25 on the writing section, compared to
the provincial results of 32.4 on the reading section and 15.1 on the writing
section.
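The 50 per cent weighting of each section can be checked against these
figures. The following Python sketch is an illustration only; the function name
`total_score` is hypothetical, and the section maximums of 50 and 25 are taken
from the "out of 50" and "out of 25" figures quoted above. Under that reading,
the section averages for students who wrote both tests combine to 68.9 per
cent, and the provincial section averages combine to about 62.6, within
rounding of the 62.5 per cent reported.

```python
def total_score(reading: float, writing: float,
                reading_max: float = 50, writing_max: float = 25) -> float:
    """Total out of 100, with each section weighted 50 per cent."""
    return 50 * reading / reading_max + 50 * writing / writing_max

both_tests = total_score(35.9, 16.5)   # students who wrote both tests
provincial = total_score(32.4, 15.1)   # provincial section averages

print(round(both_tests, 1), round(provincial, 1))
```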

Students who wrote both tests averaged 65.7 per cent on the French test.
However, there is no way to make valid comparisons between the results of the
tests in the two languages. Although the design of the tests was much the
same, the content was completely different.

The total multiple-choice scores on the two tests had a correlation coefficient of 
0.704, and the written-response scores had a correlation of 0.378, indicating 
that reading skills in the two languages, as measured by these tests, are more 
closely related than writing skills. 



In 1986, French and English language arts tests were given to grade 9
students. Although a smaller number of students wrote both the French and
English tests that year, the results are consistent with the grade 6 results for
1988. The correlation of total scores for the 615 students who completed both
tests was 0.480. The average score was 73.8 per cent on the English test,
compared to a provincial average of 64.0 per cent. Students who wrote both
tests averaged 62.8 per cent on the French test.

The total multiple-choice scores on the two tests had a correlation coefficient of
0.593, and the written-response scores had a correlation of 0.260, supporting
again the interpretation that reading skills in the two languages are more
closely related than writing skills.
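One common way to compare correlation coefficients such as these is to square
them, which gives the proportion of variance the two sets of scores share. The
Python sketch below applies that conversion to the coefficients reported for
grade 6 (1988) and grade 9 (1986). It is an added illustration rather than an
analysis from the study, and the function name `shared_variance` and the
labels are assumptions.

```python
# Correlations between French and English test scores, as reported in the text
correlations = {
    "Grade 6 total": 0.623,
    "Grade 6 multiple choice (reading)": 0.704,
    "Grade 6 written response": 0.378,
    "Grade 9 total": 0.480,
    "Grade 9 multiple choice (reading)": 0.593,
    "Grade 9 written response": 0.260,
}

def shared_variance(r: float) -> float:
    """Proportion of variance two score sets share: the square of Pearson's r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r

for label, r in correlations.items():
    print(f"{label}: r = {r:.3f}, shared variance = {shared_variance(r):.1%}")
```

On this reading, the reading sections share roughly half of their variance at
grade 6 and about a third at grade 9, while the written-response sections share
well under a fifth in both years.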

French Immersion students performed very well when tested in English
Language Arts. This may simply be because they represent a select group of
students. Clearly, receiving instruction in French has not seriously jeopardized
their skills and understandings in language arts. What we cannot tell from this
information is how well these students would have performed had they received
instruction only in English.



ACHIEVEMENT TEST RESULTS 1989



On the 1989 achievement tests, students were required to identify their
program as English, Francophone, French Immersion, or other. Thus, it was
possible to perform analyses based on language of testing and program.



Grade 3 English Language Arts 1989

Superintendents decided whether grade 3 students in French Immersion 
programs would write the English Language Arts Achievement Test. Table 4-2 
shows that about 70 per cent of French Immersion program students in grade 3 
wrote the test. 



Table 4-2
Grade 3 English Language Arts 1989
Number of French Immersion Students

                           Number of      Percentage of
Program                    Students       Students

French Immersion
  Total Population         2805           100.0
  Tested (in English)      1960            69.9
  Not Tested                845            30.1



Grade 6 Social Studies 1989

Writing the Grade 6 Social Studies Achievement Test was also optional for
French Immersion program classes. The test was available in English and in
French translation. Table 4-3 shows the participation percentages.



Table 4-3
Grade 6 Social Studies 1989
Number of French Immersion Students

                           Number of      Percentage of
Program                    Students       Students

French Immersion
  Total Population         1872           100.0
  Tested in French          931            49.8
  Tested in English          77             4.1
  Participated in
  Special Study             437            23.3
  Not Tested                427            22.8



The results for French Immersion students are complicated by the fact that
23.3 per cent participated in the special language of testing study. Of the
remaining students, the majority wrote in French. About 4 per cent of the total
wrote in English, and nearly 23 per cent did not write.

Table 4-4 gives the participation rates for the Grade 9 Science Achievement
Test. Most French Immersion students wrote the French translation.



Table 4-4
Grade 9 Science 1989
Number of French Immersion Students

                           Number of      Percentage of
Program                    Students       Students

French Immersion
  Total Population         1097           100.0
  Tested in French          946            86.2
  Tested in English          54             4.9
  Not Tested                 97             8.9
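The participation tables can be cross-checked with a short calculation: the
group counts in each table should sum to the total population, and each
percentage is the group count divided by that total. The Python sketch below
is an illustration, with the hypothetical helper `participation` and a
dictionary layout chosen for convenience. The grade 9 "Tested in English"
count is taken as 54, the figure that appears in Table 4-5 and the one that
agrees with the 4.9 per cent shown. Recomputed percentages may differ from the
printed ones in the last digit.

```python
# Counts taken from Tables 4-2, 4-3, and 4-4
tables = {
    "Grade 3 English Language Arts": {
        "total": 2805,
        "groups": {"Tested (in English)": 1960, "Not Tested": 845},
    },
    "Grade 6 Social Studies": {
        "total": 1872,
        "groups": {"Tested in French": 931, "Tested in English": 77,
                   "Participated in Special Study": 437, "Not Tested": 427},
    },
    "Grade 9 Science": {
        "total": 1097,
        "groups": {"Tested in French": 946, "Tested in English": 54,
                   "Not Tested": 97},
    },
}

def participation(table: dict) -> dict[str, float]:
    """Percentage of the total population in each group, to one decimal place."""
    return {name: round(100 * n / table["total"], 1)
            for name, n in table["groups"].items()}

for title, table in tables.items():
    # The group counts should account for every student in the population.
    assert sum(table["groups"].values()) == table["total"], title
    print(title, participation(table))
```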



Tables 4-2, 4-3, and 4-4 reflect the difficulty in identifying any reference groups
for appropriate norms. Participation rates are consistently lower than for the
regular program. In no case can students be considered a representative
sample.

Table 4-5 presents the achievement test results for students in French
Immersion and English language programs. As well, results are shown for
the different test forms (English or French) that were written.



Table 4-5

Number of Students and Average Scores 1989

                                     French Immersion Program    English Language
                                                                 Program
                                     Tested in     Tested in     Tested in
Grade/Subject                        French        English       English

Grade 3 English Language Arts
  Number of Students                 none          1960          31 998
  Total Test Score                                 69.9          68.9
    (Maximum Possible = 100)
  Part A: Writing Score                            16.5          16.2
    (Maximum Possible = 25)
  Part B: Reading Score                            29.5          29.1
    (Maximum Possible = 40)

Grade 6 Social Studies
  Number of Students                 931           77            29 918
  Total Test Score                   55.3          69.2          62.5
    (Maximum Possible = 100)
  Part A: Multiple-Choice Score      27.9          35.6          32.2
    (Maximum Possible = 50)
  Part B: Written-Response Score     16.3          19.5          17.5
    (Maximum Possible = 30)

Grade 9 Science
  Number of Students                 946           54            27 137
  Total Test                         50.1          52.0          50.1
    (Grade 9 Science Achievement Test consists of
    multiple-choice questions only with a maximum
    possible raw score of 76.)

As the language of testing study and the preceding discussion indicate,
comparisons of the average scores among groups are of low validity. Results for
schools and jurisdictions can in some cases be usefully compared. The regular
program norms are probably the most appropriate for schools and jurisdictions
evaluating Grade 3 English Language Arts results. For students in the other
grades, the results for those students who wrote in English can be compared to
regular program students who wrote in English. A jurisdiction's French
Immersion students who wrote in French can best be compared to all French
Immersion students who wrote in French. Caution is necessary for all these
comparisons, however.

This situation, with limited possibilities for comparison of test scores to norms,
will persist as long as clearly defined reference groups are not available. The
absence of an adequate data base remains problematic. In 1989, standards
were not set for students instructed in French. This is due in part to the
optional participation of students in the French assessment component of the
program. The representativeness of this group cannot be assured. Evaluation
of provincial levels of achievement must await the establishment of provincial
standards for these tests and programs.






CONCLUSIONS AND FUTURE DIRECTIONS



The purpose of this report is to focus attention on issues related to the
assessment of French Immersion program student achievement. Issues have
been described and trends in French Immersion achievement testing data have
been discussed in the hope that such an analysis will provide insight into how
the different issues can best be resolved.

For French Immersion students, the results of the 1989 language of testing
study suggest that when it comes to the assessment of grades 3 and 6 social
studies achievement, it matters in which language student achievement is
measured: scores for French Immersion students who wrote in French are
depressed relative to those achieved by students who wrote in English. What is
not clear from the data is whether this pattern occurs across all grade levels
and subject areas. When the scores of French Immersion students who wrote
the 1989 Grade 9 Science Achievement Test in French are compared to the
scores achieved by students who wrote the test in English, the size of the
difference in average scores of the two groups is less than that which is
present in the grade 6 language of testing study. This could mean that the
language of testing effect is smaller in science than in social studies or that
the effect is smaller at the grade 9 level. On the other hand, since the groups
who wrote grade 9 science in French and English were self-selecting, it could
simply be that the same language of testing effect exists in grade 9 science but
that this effect is nullified by possible group differences in ability.

What is clear from the data is that the language of testing issue is important
and deserves further study. It is also clear that the decision to test French
Immersion program students in one language rather than another must take
into account possible language of testing effects and how these effects can be
controlled or accounted for when interpreting test data. In other words, for any
data about French Immersion students to be useful and meaningful, decisions
about the language of testing must be based on more than just political
considerations. They must also consider what is statistically valid and reliable.

When considering appropriate standards and norms against which to compare
student achievement in French Immersion programs, the language of testing
results are also of interest. What can be concluded from that study is that
where there is a depression of scores as a result of the language of testing, it is
inappropriate to compare the results of the group writing in that language
against a test score that has been chosen to represent the standard on the test
of the other language. It appears to be equally inappropriate to compare the
scores of groups writing in French with those achieved by students writing in
English where there is a language of testing effect, unless that effect has been
accounted for in some way.

The 1989 grade 3 English language arts results for French Immersion students
and the 1989 grade 6 social studies and 1989 grade 9 science results for French
Immersion students who wrote the tests in English are all equal to or above the
English language program provincial averages. These results suggest that the
French Immersion students who were tested had levels of performance that
were as good as or better than those achieved provincially. From this it seems
reasonable to infer that it is appropriate to expect at least the same levels of
academic performance from students in French Immersion programs as are
expected of students in the English language programs.

From the language of testing study it was not possible to determine why French
Immersion students achieved lower scores when they wrote in French rather
than in English. It has been hypothesized that two variables could account for
such a difference: the nature of the English and French forms of the test, and
the first and second language reading abilities of the examinees. Further
information about the effect of these variables on student responses to test
questions must be acquired before a decision can be made about the
appropriateness of using French translations of English program achievement
tests to assess French Immersion student achievement. Such a decision will also
depend on whether or not it seems appropriate to apply the same expectations
to French Immersion program students as are applied to English program
students.

Whether or not testing should be mandatory for French Immersion program
students will depend in part on how successfully some of these other issues can
be resolved. The same caveat applies to the issue of whether or not the
achievement of students instructed in French Immersion programs should be
assessed and reported on a provincial basis. For these reasons, it seems
important that further study and discussion of the issues and problems
identified in this report should occur.

Student Evaluation staff are addressing the issues raised in this report in two
ways in 1990. First, a plan for ensuring a full discussion of the issues
presented is being prepared. A cross section of educators involved in the
delivery of French Immersion programs will be involved in these discussions.
Second, a follow-up language of testing study at the grades 3 and 6 levels in
1990 is being planned. As well, grade 9 students will write the French and
English Language Arts achievement tests. Information collected from these
activities will assist in further resolving the issues presented. Other activities
may be planned following in-depth discussions with educators in French
Immersion programs.